Biblioteca Digital

911 resultados para multivariate discriminant analysis

Predicting class I major histocompatibility complex (MHC) binders using multivariate statistics:comparison of discriminant analysis and multiple linear regression

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The accurate in silico identification of T-cell epitopes is a critical step in the development of peptide-based vaccines, reagents, and diagnostics. It has a direct impact on the success of subsequent experimental work. Epitopes arise as a consequence of complex proteolytic processing within the cell. Prior to being recognized by T cells, an epitope is presented on the cell surface as a complex with a major histocompatibility complex (MHC) protein. A prerequisite therefore for T-cell recognition is that an epitope is also a good MHC binder. Thus, T-cell epitope prediction overlaps strongly with the prediction of MHC binding. In the present study, we compare discriminant analysis and multiple linear regression as algorithmic engines for the definition of quantitative matrices for binding affinity prediction. We apply these methods to peptides which bind the well-studied human MHC allele HLA-A*0201. A matrix which results from combining results of the two methods proved powerfully predictive under cross-validation. The new matrix was also tested on an external set of 160 binders to HLA-A*0201; it was able to recognize 135 (84%) of them.

Maize kernel hardness classification by near infrared (NIR) hyperspectral imaging and multivariate data analysis.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The use of near infrared (NIR) hyperspectral imaging and hyperspectral image analysis for distinguishing between hard, intermediate and soft maize kernels from inbred lines was evaluated. NIR hyperspectral images of two sets (12 and 24 kernels) of whole maize kernels were acquired using a Spectral Dimensions MatrixNIR camera with a spectral range of 960-1662 nm and a sisuChema SWIR (short wave infrared) hyperspectral pushbroom imaging system with a spectral range of 1000-2498 nm. Exploratory principal component analysis (PCA) was used on absorbance images to remove background, bad pixels and shading. On the cleaned images. PCA could be used effectively to find histological classes including glassy (hard) and floury (soft) endosperm. PCA illustrated a distinct difference between glassy and floury endosperm along principal component (PC) three on the MatrixNIR and PC two on the sisuChema with two distinguishable clusters. Subsequently partial least squares discriminant analysis (PLS-DA) was applied to build a classification model. The PLS-DA model from the MatrixNIR image (12 kernels) resulted in root mean square error of prediction (RMSEP) value of 0.18. This was repeated on the MatrixNIR image of the 24 kernels which resulted in RMSEP of 0.18. The sisuChema image yielded RMSEP value of 0.29. The reproducible results obtained with the different data sets indicate that the method proposed in this paper has a real potential for future classification uses.

On classical and other methods of discriminant analysis and estimation of log-odds

Relevância:

100.00% 100.00%

Publicador:

Resumo:

For two multinormal populations with equal covariance matrices the likelihood ratio discriminant function, an alternative allocation rule to the sample linear discriminant function when n1 ≠ n2 ,is studied analytically. With the assumption of a known covariance matrix its distribution is derived and the expectation of its actual and apparent error rates evaluated and compared with those of the sample linear discriminant function. This comparison indicates that the likelihood ratio allocation rule is robust to unequal sample sizes. The quadratic discriminant function is studied, its distribution reviewed and evaluation of its probabilities of misclassification discussed. For known covariance matrices the distribution of the sample quadratic discriminant function is derived. When the known covariance matrices are proportional exact expressions for the expectation of its actual and apparent error rates are obtained and evaluated. The effectiveness of the sample linear discriminant function for this case is also considered. Estimation of true log-odds for two multinormal populations with equal or unequal covariance matrices is studied. The estimative, Bayesian predictive and a kernel method are compared by evaluating their biases and mean square errors. Some algebraic expressions for these quantities are derived. With equal covariance matrices the predictive method is preferable. Where it derives this superiority is investigated by considering its performance for various levels of fixed true log-odds. It is also shown that the predictive method is sensitive to n1 ≠ n2. For unequal but proportional covariance matrices the unbiased estimative method is preferred. Product Normal kernel density estimates are used to give a kernel estimator of true log-odds. The effect of correlation in the variables with product kernels is considered. With equal covariance matrices the kernel and parametric estimators are compared by simulation. For moderately correlated variables and large dimension sizes the product kernel method is a good estimator of true log-odds.

Geostatistical Fisher discriminant analysis

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A geostatistical version of the classical Fisher rule (linear discriminant analysis) is presented.This method is applicable when a large dataset of multivariate observations is available within a domain split in several known subdomains, and it assumes that the variograms (or covariance functions) are comparable between subdomains, which only differ in the mean values of the available variables. The method consists on finding the eigen-decomposition of the matrix W-1B, where W is the matrix of sills of all direct- and cross-variograms, and B is the covariance matrix of the vectors of weighted means within each subdomain, obtained by generalized least squares. The method is used to map peat blanket occurrence in Northern Ireland, with data from the Tellus
survey, which requires a minimal change to the general recipe: to use compositionally-compliant variogram tools and models, and work with log-ratio transformed data.

Determination of disease biomarkers in Eucalyptus by comprehensive two-dimensional gas chromatography and multivariate data analysis

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper is reported the use of the chromatographic profiles of volatiles to determine disease markers in plants - in this case, leaves of Eucalyptus globulus contaminated by the necrotroph fungus Teratosphaeria nubilosa. The volatile fraction was isolated by headspace solid phase microextraction (HS-SPME) and analyzed by comprehensive two-dimensional gas chromatography-fast quadrupole mass spectrometry (GC. ×. GC-qMS). For the correlation between the metabolic profile described by the chromatograms and the presence of the infection, unfolded-partial least squares discriminant analysis (U-PLS-DA) with orthogonal signal correction (OSC) were employed. The proposed method was checked to be independent of factors such as the age of the harvested plants. The manipulation of the mathematical model obtained also resulted in graphic representations similar to real chromatograms, which allowed the tentative identification of more than 40 compounds potentially useful as disease biomarkers for this plant/pathogen pair. The proposed methodology can be considered as highly reliable, since the diagnosis is based on the whole chromatographic profile rather than in the detection of a single analyte. © 2013 Elsevier B.V..

Identification of species of the Euterpe genus by rare earth elements using inductively coupled plasma mass spectrometry and linear discriminant analysis

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Discriminant analysis of neuromuscular variables in chronic low back pain

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Background: Investigation and discrimination of neuromuscular variables related to the complex aetiology of low back pain could contribute to clarifying the factors associated with symptoms. Objective: Analysing the discriminative power of neuromuscular variables in low back pain. Methods: This study compared muscle endurance, proprioception and isometric trunk assessments between women with low back pain (LBP, n=14) and a control group (CG, n=14). Multivariate analysis of variance and discriminant analysis of the data were performed. Results: The muscle endurance time (s) was shorter in the LBP group than in the CG (p=0.004) with values of 85.81 (37.79) and 134.25 (43.88), respectively. The peak torque (Nm/kg) for trunk extension was 2.48 (0.69) in the LBP group and 3.56 (0.88) in the GG (p=0.001); for trunk flexion, the mean torque was 1.49 (0.40) in the LBP group and 1.85 (0.39) in the CG (p=0.023). The repositioning error (degrees) before the endurance test was 2.66 (1.36) in the LBP group and 2.41 (1.46) in the CG (p=0.664), and after the endurance test, it was 2.95 (1.94) in the LBP group and 2.00 (1.16) in the CG (p=0.06). Furthermore, the variables showed discrimination between the groups (p=0.007), with 78.6% of the individuals with low back pain correctly classified in the LBP group. In turn, variables related to muscle activation showed no difference in discrimination between the groups (p=0.369). Conclusion: Based on these findings, the clinical management of low back pain should consist of both resistance and strength training, particularly in the extensor muscles.

Quantitative Chemical Profile and Multivariate Statistical Analysis of Alembic Distilled Sugarcane Spirit Fractions

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Concentrations of 39 organic compounds were determined in three fractions (head, heart and tail) obtained from the pot still distillation of fermented sugarcane juice. The results were evaluated using analysis of variance (ANOVA), Tukey's test, principal component analysis (PCA), hierarchical cluster analysis (HCA) and linear discriminant analysis (LDA). According to PCA and HCA, the experimental data lead to the formation of three clusters. The head fractions give rise to a more defined group. The heart and tail fractions showed some overlap consistent with its acid composition. The predictive ability of calibration and validation of the model generated by LDA for the three fractions classification were 90.5 and 100%, respectively. This model recognized as the heart twelve of the thirteen commercial cachacas (92.3%) with good sensory characteristics, thus showing potential for guiding the process of cuts.

Sparse Linear Discriminant Analysis for Simultaneous Testing for the Significance of a Gene Set/Pathway and Gene Selection

Relevância:

100.00% 100.00%

Publicador:

Discriminant analysis of the specialty of elite cyclist

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The different demands of competition coupled with the morphological and physiological characteristics of cyclists have led to the appearance of cycling specialities. The aims of this study were to determine the differences in the anthropometric and physiological features in road cyclists with different specialities, and to develop a multivariate model to classify these specialities and predict which speciality may be appropriate to a given cyclist. Twenty male, elite amateur cyclists were classified by their trainers as either flat terrain riders, hill climbers, or all-terrain riders. Anthropometric and cardiorespiratory studies were then undertaken. The results were analysed by MANOVA and two discriminant tests. Most differences between the speciality groups were of an anthropometric nature. The only cardiorespiratory variable that differed significantly (p < 0.05) was maximum oxygen consumption with respect to body weight (VO2max/kg). The first discriminant test classified 100% of the cyclists within their true speciality; the second, which took into account only anthropometric variables, correctly classified 75%. The first discriminant model allows the likely speciality of still non-elite cyclists to be predicted from a small number of variables, and may therefore help in their specific training.

Discriminant Analysis of Geographical Origin of Cork Planks and Stoppers by Near Infrared Spectroscopy

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The objective of this study was to assess the potential of visible and near infrared spectroscopy (VIS+NIRS) combined with multivariate analysis for identifying the geographical origin of cork. The study was carried out on cork planks and natural cork stoppers from the most representative cork-producing areas in the world. Two training sets of international and national cork planks were studied. The first set comprised a total of 479 samples from Morocco, Portugal, and Spain, while the second set comprised a total of 179 samples from the Spanish regions of Andalusia, Catalonia, and Extremadura. A training set of 90 cork stoppers from Andalusia and Catalonia was also studied. Original spectroscopic data were obtained for the transverse sections of the cork planks and for the body and top of the cork stoppers by means of a 6500 Foss-NIRSystems SY II spectrophotometer using a fiber optic probe. Remote reflectance was employed in the wavelength range of 400 to 2500 nm. After analyzing the spectroscopic data, discriminant models were obtained by means of partial least square (PLS) with 70% of the samples. The best models were then validated using 30% of the remaining samples. At least 98% of the international cork plank samples and 95% of the national samples were correctly classified in the calibration and validation stage. The best model for the cork stoppers was obtained for the top of the stoppers, with at least 90% of the samples being correctly classified. The results demonstrate the potential of VIS + NIRS technology as a rapid and accurate method for predicting the geographical origin of cork plank and stoppers

Statnote 29:discriminant analysis

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Discriminant analysis (also known as discriminant function analysis or multiple discriminant analysis) is a multivariate statistical method of testing the degree to which two or more populations may overlap with each other. It was devised independently by several statisticians including Fisher, Mahalanobis, and Hotelling ). The technique has several possible applications in Microbiology. First, in a clinical microbiological setting, if two different infectious diseases were defined by a number of clinical and pathological variables, it may be useful to decide which measurements were the most effective at distinguishing between the two diseases. Second, in an environmental microbiological setting, the technique could be used to study the relationships between different populations, e.g., to what extent do the properties of soils in which the bacterium Azotobacter is found differ from those in which it is absent? Third, the method can be used as a multivariate ‘t’ test , i.e., given a number of related measurements on two groups, the analysis can provide a single test of the hypothesis that the two populations have the same means for all the variables studied. This statnote describes one of the most popular applications of discriminant analysis in identifying the descriptive variables that can distinguish between two populations.

Application of Regularized Discriminant Analysis

Relevância:

100.00% 100.00%

Publicador:

Resumo:

2000 Mathematics Subject Classification: 62H30, 62P99

Non-invasive, quantitative analysis of drug mixtures in containers using spatially offset Raman spectroscopy (SORS) and multivariate statistical analysis

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, spatially offset Raman spectroscopy (SORS) is demonstrated for non-invasively investigating the composition of drug mixtures inside an opaque plastic container. The mixtures consisted of three components including a target drug (acetaminophen or phenylephrine hydrochloride) and two diluents (glucose and caffeine). The target drug concentrations ranged from 5% to 100%. After conducting SORS analysis to ascertain the Raman spectra of the concealed mixtures, principal component analysis (PCA) was performed on the SORS spectra to reveal trends within the data. Partial least squares (PLS) regression was used to construct models that predicted the concentration of each target drug, in the presence of the other two diluents. The PLS models were able to predict the concentration of acetaminophen in the validation samples with a root-mean-square error of prediction (RMSEP) of 3.8% and the concentration of phenylephrine hydrochloride with an RMSEP of 4.6%. This work demonstrates the potential of SORS, used in conjunction with multivariate statistical techniques, to perform non-invasive, quantitative analysis on mixtures inside opaque containers. This has applications for pharmaceutical analysis, such as monitoring the degradation of pharmaceutical products on the shelf, in forensic investigations of counterfeit drugs, and for the analysis of illicit drug mixtures which may contain multiple components.

On the application of the probabilistic linear discriminant analysis to face recognition across expression

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Facial expression is one of the main issues of face recognition in uncontrolled environments. In this paper, we apply the probabilistic linear discriminant analysis (PLDA) method to recognize faces across expressions. Several PLDA approaches are tested and cross-evaluated on the Cohn-Kanade and JAFFE databases. With less samples per gallery subject, high recognition rates comparable to previous works have been achieved indicating the robustness of the approaches. Among the approaches, the mixture of PLDAs has demonstrated better performances. The experimental results also indicate that facial regions around the cheeks, eyes, and eyebrows are more discriminative than regions around the mouth, jaw, chin, and nose.

«
1
2
3
4
5
6
7
8
...
60
61
»